ABSTRACT

Deep surveys have recently discovered galaxies at the tail end of the epoch of 
reionization. In the near future, these discoveries will be complemented by a new gen- 
eration of low-frequency radio observatories that will map the distribution of neutral 
hydrogen in the intergalactic medium through its redshifted 21cm emission. In this 
paper we calculate the expected cross-correlation between the distribution of galax- 
ies and intergalactic 21cm emission at high redshifts. We demonstrate using a simple 
model that overdense regions are expected to be ionized early as a result of their bi- 
ased galaxy formation. This early phase leads to an anti-correlation between the 21cm 
emission and the overdensities in galaxies, matter, and neutral hydrogen. Existing Lyα
surveys probe galaxies that are highly clustered in overdense regions. By comparing 
21cm emission from regions near observed galaxies to those away from observed galax- 
ies, future observations will be able to test this generic prediction and calibrate the 
ionizing luminosity of high-redshift galaxies. 

Key words: cosmology: diffuse radiation, large scale structure, theory - galaxies: 
high redshift, intergalactic medium



1 INTRODUCTION



An important question that should be addressed by suc- 
cessful models of reionization concerns whether overdense 
or underdense regions become ionized first. In regions that 
are overdense, galaxies will be over-abundant for two rea- 
sons; first because there is more material per unit volume to 
make galaxies, and second because small-scale fluctuations 
need to be of lower amplitude to form a galaxy when embed- 
ded in a larger-scale overdensity (the so-called galaxy bias; 
see Mo & White 1996). Regarding reionization of the intergalactic
medium (IGM), the first effect will be compensated by the increased
density of gas to be ionized. Furthermore, the increase in the
recombination rate in overdense regions will be counteracted by the
galaxy bias in overdense regions. However, the latter effects need not
cancel, and could lead to either enhanced or delayed reionization in
overdense regions.

The process of reionization also contains several layers 
of feedback. Radiative feedback heats the IGM and results 
in the suppression of low-mass galaxy formation (Efstathiou 1992;
Thoul & Weinberg 1996; Quinn et al. 1996; Dijkstra et al. 2004). This
delays the completion of reionization by lowering the local star
formation rate, but the effect is counteracted in overdense regions by
the biased formation of massive galaxies. The radiation feedback may
therefore be more important in low-density regions where small galaxies
contribute more significantly to the ionizing flux.

In this paper we use a simple model to evaluate the 
relative significance of the above effects and compute the 
correlation between the local overdensity and quantities such 
as the hydrogen neutral-fraction, the brightness temperature 
of redshifted 21cm emission, and the overdensity of massive 
galaxies. In particular, we evaluate the prospects for finding 
whether reionization is enhanced or suppressed in overdense 
regions based on the combination of high redshift galaxy 
surveys and redshifted 21cm surveys. 

Figure 1. Left: Fluctuations in the hydrogen neutral fraction
[δ_HI = x_HI(δ)/⟨x_HI⟩ − 1] versus overdensity δ (for R ≫ R_min). The
solid, dot-dashed and dashed lines represent values of C = 10, C = 2 and
C = 20. For comparison, we show three lines of slope dδ_HI/dδ = −0.5,
−1.5 and −2.5. Right: The dependence of the predicted brightness
temperature (T) on overdensity. The solid, dot-dashed and dashed lines
represent values of C = 10, C = 2 and C = 20. For comparison, the three
lines show Equation (2) with values of β = −0.5, −1.5 and −2.5. The upper
and lower rows correspond to observations at z = 6.57 and z = 8
respectively.

For illustration, the problem may be parameterised by relating the
neutral fraction [x_HI(δ)] in regions of linear overdensity δ (smoothed
over spheres of large radius R) to the average neutral fraction (⟨x_HI⟩)
using a power-law with index β(R):

$$ x_{\rm HI}(\delta) = \langle x_{\rm HI}\rangle (1+\delta)^{\beta(R)} \approx \langle x_{\rm HI}\rangle \left[1+\beta(R)\,\delta\right], \qquad (1) $$

where we assume δ ≪ 1. The resulting deviation in the brightness
temperature of a region of overdensity δ at z > 6 is

$$ T(\delta) \approx 22\,{\rm mK}\; x_{\rm HI}(\delta)\left[1+\frac{4}{3}\delta\right] \approx 22\,{\rm mK}\;\langle x_{\rm HI}\rangle\left[1+\left(\frac{4}{3}+\beta(R)\right)\delta\right], \qquad (2) $$

where we have assumed hydrogen in the IGM to have a spin temperature
well in excess of the CMB temperature. In Equation (2) the pre-factor of
4/3 on the overdensity refers to the spherically averaged enhancement of
the brightness temperature due to peculiar velocities in overdense
regions (Bharadwaj & Ali 2005; Barkana & Loeb 2005). In a state where
the ionization fraction in the IGM is independent of density, the value
of the index is β = 0 for all scales. If β < 0, then ionization is
enhanced in overdense regions. Conversely, reionization is suppressed in
overdense regions if β > 0. High redshift galaxies are preferentially
located in large-scale regions with δ > 0. The departure of the mean
21cm brightness temperature in these regions from the average IGM
provides the opportunity to measure the value of β, and hence to
determine whether overdense or underdense regions were reionized first.
In §2 we describe a simple model that predicts β < 0. Regions of higher
density should therefore be more ionized and possess reduced levels of
fluctuations in redshifted 21cm emission. We later describe how the
dependence of 21cm emission on overdensity could be extracted based on
the fact that more massive galaxies populate higher density regions.
Throughout the paper we adopt the set of cosmological parameters
determined by WMAP (Spergel et al. 2006) for a flat ΛCDM universe.
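
The parameterisation in Equations (1) and (2) can be sketched numerically as follows. This is only an illustration: the function names and the example values of ⟨x_HI⟩ and δ are assumptions chosen here, not quantities taken from the model of §2.

```python
def neutral_fraction(delta, x_mean, beta):
    """Equation (1): x_HI(delta) = <x_HI> (1 + delta)**beta."""
    return x_mean * (1.0 + delta) ** beta


def brightness_temperature_mk(delta, x_mean, beta, linear=False):
    """Equation (2) in mK, valid when the spin temperature far exceeds the CMB temperature."""
    if linear:
        # First-order form: 22 mK <x_HI> [1 + (4/3 + beta) delta]
        return 22.0 * x_mean * (1.0 + (4.0 / 3.0 + beta) * delta)
    # Full form: 22 mK x_HI(delta) [1 + (4/3) delta]
    return 22.0 * neutral_fraction(delta, x_mean, beta) * (1.0 + 4.0 * delta / 3.0)


if __name__ == "__main__":
    # Illustrative values only: mean neutral fraction 0.5, overdensity 0.1.
    for beta in (-0.5, -1.5, -2.5):
        t = brightness_temperature_mk(0.1, x_mean=0.5, beta=beta)
        print(f"beta = {beta:+.1f}: T(delta = 0.1) = {t:.2f} mK")
```

In the linear form the sign of dT/dδ is set by 4/3 + β, so an overdense region appears fainter than average only when β < −4/3.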



2 SIMPLE MODEL FOR THE CORRELATION 
OF 21CM EMISSION WITH LARGE-SCALE 
OVERDENSITY 

Figure 2. The auto-correlation function of 21cm brightness temperature
smoothed on different angular scales θ. The four lines show the results
at redshifts z = 6.5 (solid), z = 7 (short-dashed), z = 8 (long-dashed)
and z = 10 (dot-dashed). The left and right panels correspond to
clumping factors C = 10 and C = 2, respectively.

The evolution of the ionization fraction by mass Q_{δ,R} of a particular
region of scale R with overdensity δ (at observed redshift z_obs) may be
written as (Wyithe & Loeb 2003)

$$ \frac{dQ_{\delta,R}}{dt} = \frac{N_{\rm ion}}{0.76}\left[Q_{\delta,R}\frac{dF_{\rm col}(\delta,R,z,M_{\rm ion})}{dt} + \left(1-Q_{\delta,R}\right)\frac{dF_{\rm col}(\delta,R,z,M_{\rm min})}{dt}\right] - \alpha_{\rm B}\,C\,n_{\rm H}^0\left[1+\delta\frac{D(z)}{D(z_{\rm obs})}\right](1+z)^3\,Q_{\delta,R}, \qquad (3) $$

where N_ion is the number of photons entering the IGM per baryon in
galaxies, α_B is the case-B recombination coefficient, C is the clumping
factor (which we assume, for simplicity, to be constant), and D(z) is
the growth factor between redshift z and the present time. The
production rate of ionizing photons in neutral regions is assumed to be
proportional to the collapsed fraction F_col of mass in halos above the
minimum threshold mass for star formation (M_min), while in ionized
regions the minimum halo mass is limited by the Jeans mass in an ionized
IGM (M_ion). We assume M_min to correspond to a virial temperature of
10^4 K, representing the hydrogen cooling threshold, and M_ion to
correspond to a virial temperature of 10^5 K, representing the mass
below which infall is suppressed from an ionized IGM (Dijkstra et al.
2004). In a region of co-moving radius R and mean overdensity
δ(z) = δ D(z)/D(z_obs) [specified at redshift z instead of the usual
z = 0], the relevant collapsed fraction is obtained from the extended
Press-Schechter (1974) model (Bond et al. 1991) as

$$ F_{\rm col}(\delta,R,z) = {\rm erfc}\left(\frac{\delta_c-\delta}{\sqrt{2\left([\sigma_{\rm gal}]^2-[\sigma(R)]^2\right)}}\right), \qquad (4) $$

where erfc(x) is the complementary error function, σ(R) is the variance
of the density field smoothed on a scale R, and σ_gal is the variance of
the density field smoothed on a scale R_gal, corresponding to a mass
scale of M_min or M_ion (both evaluated at redshift z rather than at
z = 0). In this expression, the critical linear overdensity for the
collapse of a spherical top-hat density perturbation is δ_c ≈ 1.69. The
details of the galaxy bias and radiative feedback must be incorporated
in order to compute how F_col and C vary with δ. Before proceeding, we
stress that this model is intended to be illustrative only. While
semi-analytic models can offer a picture of the global properties of
reionization, the detailed processes must be modeled numerically using
cosmological simulations (e.g. Zahn et al. 2006).
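
As a minimal sketch of Equation (4), the collapsed fraction can be evaluated with the standard-library complementary error function. The values of σ_gal and σ(R) used in the example are hypothetical placeholders; in the model they would follow from the linear power spectrum at the redshift of interest.

```python
import math

DELTA_C = 1.69  # critical linear overdensity for spherical top-hat collapse


def collapsed_fraction(delta, sigma_gal, sigma_r, delta_c=DELTA_C):
    """Equation (4): extended Press-Schechter collapsed fraction in a region of overdensity delta.

    sigma_gal is the density-field fluctuation on the galaxy mass scale
    (M_min or M_ion) and sigma_r the fluctuation on the region scale R,
    both evaluated at the redshift of interest.
    """
    if sigma_gal <= sigma_r:
        raise ValueError("the region scale must be larger than the galaxy scale")
    return math.erfc((delta_c - delta) / math.sqrt(2.0 * (sigma_gal ** 2 - sigma_r ** 2)))


if __name__ == "__main__":
    # Hypothetical fluctuation amplitudes, chosen only to illustrate the trend.
    for delta in (-0.1, 0.0, 0.1):
        print(delta, collapsed_fraction(delta, sigma_gal=1.0, sigma_r=0.1))
```

The collapsed fraction rises with δ and falls when the threshold mass is raised (smaller σ_gal), which is the biased galaxy formation and the radiative feedback described above.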

Equation (3) may be integrated as a function of δ at the observed
redshift. As an example, we find the value of N_ion that yields overlap
of ionized regions at the mean density IGM by z ~ 6 (White et al. 2003).
We then use the model to compute the filling fraction of ionized regions
at z = 6.57 (corresponding to the redshift of Lyα galaxy surveys
discussed in the next section) and at a higher redshift of z = 8. The
results are shown in the left panels of Figure 1, where fluctuations in
neutral fraction [δ_HI = x_HI(δ)/⟨x_HI⟩ − 1] are plotted versus δ.
Results are shown assuming clumping factors of C = 10 (solid lines),
C = 2 (dot-dashed lines), and C = 20 (dashed lines). Here the
overdensities correspond to length scales in the IGM that are
significantly in excess of the length scale corresponding to the minimum
mass (i.e. Figure 1 represents scales where
[σ_min]² − [σ(R)]² ≈ [σ_min]²). Therefore, while Figure 1 shows
dependencies out to values of δ ~ 0.5, such overdensities are rare on
these large scales. We find that overdense regions are more highly
ionized than underdense regions, implying that the bias of massive
galaxies in overdense regions dominates over the increased recombination
rate there. Figure 1 shows that fluctuations in the neutral fraction at
a fixed overdensity (as measured at an observed redshift z_obs) are
smaller at earlier times. The reason is simply that the neutral fraction
is larger at earlier times, so that fluctuations in ionization fraction
result in smaller relative fluctuations in neutral fraction. The left
panels of Figure 1 also show three lines of slope dδ_HI/dδ = −0.5, −1.5
and −2.5. From comparison of these lines with our model prediction, we
find that the value of β in Equations (1) and (2) is around β ~ −1.5 at
z = 8 (for C = 10), and is significantly steeper at lower redshifts or
lower clumping factors. Figure 1 suggests that a sufficiently large
clumping factor would result in an ionization fraction that was lower in
overdense relative to underdense regions.

On the right hand panels of Figure 1 we plot the variation of predicted
21cm brightness temperature (T) with overdensity. Results are again
shown assuming clumping factors of C = 10 (solid lines), C = 2
(dot-dashed lines) and C = 20 (dashed lines). For comparison, the three
lines show Equation (2) with values of β = −0.5, −1.5 and −2.5. At
positive overdensities, we find a reduced neutral fraction. However, in
these regions the neutral hydrogen present is at an increased
overdensity. The brightness temperature of an overdense region is
therefore reduced relative to the average by increased ionization, but
increased relative to the average by the higher gas density. Conversely,
the brightness temperature of an underdense region is increased relative
to the average by reduced ionization, but decreased relative to the
average by the lower gas density. At large redshifts and/or clumping
factor, we find that the sum of these effects results in a non-monotonic
dependence of T on δ, with brightness temperature peaking near δ = 0.

Figure 3. Left panel: The number counts of Lyα emitters as a function of
observed luminosity 𝒯L_Lyα. The data points are from the cumulative
number counts of Kashikawa et al. (2006) for the SDF (incompleteness
corrected). The solid line is the best fit model assuming Equations (7)
and (8), with ε_dc = 0.38 and f_star𝒯 = 0.24. Right panel: contours of
likelihood in the f_star𝒯 versus ε_dc plane at 14%, 26% and 64% of the
maximum likelihood. These contours correspond to the 3, 2 and 1-sigma
levels of a Gaussian distribution. The long and short dashed lines show
lines of constant mass at M = 10^10 M_⊙ and M = 10^11 M_⊙ respectively.

On co-moving scales (R) sufficiently large that the variance in the
density field remains in the linear regime, we are able to compute the
auto-correlation function [ξ_T(θ)] of brightness temperature smoothed
with top-hat windows of angular radius θ = R/d_A(z), where d_A is the
angular diameter distance:

$$ \xi_T(\theta) = \left\langle (T-\langle T\rangle)^2\right\rangle^{1/2} = \left[\frac{1}{\sqrt{2\pi}\,\sigma(R)}\int d\delta\,\left(T(\delta)-\langle T\rangle\right)^2 e^{-\delta^2/2\sigma(R)^2}\right]^{1/2}, \qquad (5) $$

where

$$ \langle T\rangle = \frac{1}{\sqrt{2\pi}\,\sigma(R)}\int d\delta\; T(\delta)\, e^{-\delta^2/2\sigma(R)^2}, \qquad (6) $$

and σ(R) is the variance of the density field (at redshift z) smoothed
on a scale R. The resulting auto-correlation functions are shown in
Figure 2 assuming clumping factors of C = 10 (left) and C = 2 (right).
The four curves correspond to redshifts of z = 6.5, 7, 8 and 10, and
demonstrate (in the case of C = 10) that the amplitude of the
auto-correlation function need not vary monotonically with redshift (or
average neutral fraction).
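
The Gaussian averages in Equations (5) and (6) are straightforward to evaluate numerically. The sketch below does so with a simple trapezoid rule and, as an assumption made here for illustration, uses the linearised T(δ) of Equation (2) with ⟨x_HI⟩ = 0.5 and β = −1.5 instead of the full model curve; σ is treated as the width of the Gaussian in the exponent.

```python
import math


def gaussian_average(func, sigma, n_steps=4001, n_sigma=8.0):
    """Average func(delta) over a zero-mean Gaussian of width sigma (trapezoid rule)."""
    lo = -n_sigma * sigma
    step = 2.0 * n_sigma * sigma / (n_steps - 1)
    total = 0.0
    for i in range(n_steps):
        delta = lo + i * step
        edge = 0.5 if i in (0, n_steps - 1) else 1.0  # trapezoid end-point weights
        total += edge * func(delta) * math.exp(-0.5 * (delta / sigma) ** 2)
    return total * step / (math.sqrt(2.0 * math.pi) * sigma)


def mean_temperature(temp_of_delta, sigma):
    """Equation (6): Gaussian-weighted mean brightness temperature."""
    return gaussian_average(temp_of_delta, sigma)


def rms_temperature(temp_of_delta, sigma):
    """Equation (5): root-mean-square brightness temperature fluctuation."""
    t_mean = mean_temperature(temp_of_delta, sigma)
    return math.sqrt(gaussian_average(lambda d: (temp_of_delta(d) - t_mean) ** 2, sigma))


def linear_temperature_mk(delta, x_mean=0.5, beta=-1.5):
    """Linearised Equation (2); x_mean and beta are illustrative values."""
    return 22.0 * x_mean * (1.0 + (4.0 / 3.0 + beta) * delta)


if __name__ == "__main__":
    print(mean_temperature(linear_temperature_mk, 0.1))
    print(rms_temperature(linear_temperature_mk, 0.1))
```

For a linear T(δ) the result reduces to 22 mK ⟨x_HI⟩ |4/3 + β| σ(R), which provides a quick check of the integration.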



3 THE MASSES OF Lyα EMITTERS

We would like to determine whether the distribution of high redshift
galaxies can be combined with redshifted 21cm maps to probe the
correlations between the redshifted 21cm emission and galaxies, and
hence between the ionization fraction and the large-scale overdensity.
We concentrate on the case of Lyα emitting galaxies. In order to
estimate the bias of Lyα emitting galaxies relative to the underlying
density field, we first compare the observed abundance of Lyα emitters
to a simple model in order to estimate their galaxy mass.

In the Subaru Deep Field (SDF), Kashikawa et al. (2006) find the density
[N_obs(> 𝒯L_Lyα)] of Lyα emitters brighter than an observed luminosity
(𝒯L_Lyα) at z = 6.57. Here L_Lyα is the intrinsic luminosity of the
emitter and 𝒯 is the transmission of Lyα photons through the IGM
(Kashikawa et al. 2006 assume 𝒯 = 1 when converting from observed flux
to luminosity). In the lowest luminosity bin, 𝒯L_Lyα = 3 × 10^42 erg/s,
and the density is N_obs(> 𝒯L_Lyα) = 4 × 10^−4 Mpc^−3. The number counts
from Kashikawa et al. (2006) are reproduced in Figure 3 (for the case
where the densities are corrected for incompleteness). Our simple model
for Lyα emitters (Haiman & Cen 2005) assumes the luminosity to be



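$$ \mathcal{T}L_{\rm Ly\alpha} = [\ldots]\times10^{42}\,{\rm erg\,s^{-1}}\left(\frac{f_{\rm star}\mathcal{T}}{\epsilon_{\rm dc}}\right)\left(\frac{M}{10^{10}M_\odot}\right), \qquad (7) $$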



where ε_dc is the duty-cycle, f_star is the star-formation efficiency,
M is the halo mass, and we have assumed the escape fraction of ionizing
photons to be much smaller than unity. The density of Lyα emitters more
luminous than 𝒯L_Lyα is given by



$$ N(>\mathcal{T}L_{\rm Ly\alpha}) = \epsilon_{\rm dc}\int_{M}^{\infty} dM'\,\frac{dn}{dM'}, \qquad (8) $$



where dn/dM is the Press-Schechter (1974) mass function (comoving number
density per galaxy mass) with the modification of Sheth & Tormen (2002).
For different combinations of ε_dc and f_star𝒯 we then generate model
number counts N(> 𝒯L_Lyα) using Equations (7) and (8). These number
counts are compared to the data from the SDF. For each parameter set, a
likelihood is computed,
ℒ_Ly(ε_dc, f_star𝒯) = ∏_{i=1}^{9} exp[−(1/2)(N_obs,i − N_i)²/σ_i²],
where N_obs,i and N_i are evaluated in the ith observed luminosity bin
(i = 1 to 9), and σ_i is the uncertainty in observed density in the ith
luminosity bin. In the right panel of Figure 3 we show contours at 14%,
26% and 64% of the maximum likelihood, corresponding to the 3, 2 and
1-sigma levels of a Gaussian distribution. The best fit model assuming
Equations (7) and (8) has ε_dc = 0.38 and f_star𝒯 = 0.24. We expect a
transmission of order unity (Dijkstra & Wyithe 2006). These values are
unexpectedly high; however, the model is degenerate over a wide range of
parameter values. The best fit model is plotted over the data in the
left panel of Figure 3. We are interested in the mass of Lyα emitters.
In the right hand panel we show lines of constant mass corresponding to
the lowest luminosity (𝒯L_Lyα = 3 × 10^42 erg/s) bin. The long and short
dashed lines represent masses of M = 10^10 M_⊙ and M = 10^11 M_⊙
respectively. Based on these results we conclude that the halo masses of
Lyα emitters in the SDF are larger than 10^10 M_⊙. In the remainder of
this paper we show 21cm-galaxy correlations for both M = 10^10 M_⊙ and
M = 10^11 M_⊙.
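
The likelihood used to compare model and observed number counts is a product of Gaussian terms over the luminosity bins. A minimal sketch is given below; the function name and the toy numbers are hypothetical, and the model counts N_i, which in the paper come from Equations (7) and (8) with a Sheth-Tormen mass function, are simply passed in as a list.

```python
import math


def counts_likelihood(n_obs, n_model, sigma):
    """Product over bins of exp[-0.5 (N_obs,i - N_i)^2 / sigma_i^2]."""
    if not (len(n_obs) == len(n_model) == len(sigma)):
        raise ValueError("all inputs must have one entry per luminosity bin")
    # Sum the exponents first, which is numerically safer than multiplying many small terms.
    log_like = sum(-0.5 * ((o - m) / s) ** 2 for o, m, s in zip(n_obs, n_model, sigma))
    return math.exp(log_like)


if __name__ == "__main__":
    # Toy cumulative densities in arbitrary units, for illustration only.
    observed = [4.0, 2.5, 1.2]
    errors = [0.8, 0.6, 0.4]
    for model in ([4.0, 2.5, 1.2], [3.2, 2.5, 1.2]):
        print(model, counts_likelihood(observed, model, errors))
```

Evaluating this function on a grid of (ε_dc, f_star𝒯) and dividing by its maximum gives contours of the kind described for the right panel of Figure 3.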



4 CROSS-CORRELATION OF 21CM
EMISSION WITH GALAXY PROPERTIES

Next we discuss the cross-correlation between the number density of
massive galaxies and 21cm emission. The observed overdensity of galaxies
is simply δ_gal = 4/3 × b(M, z)δ, where b(M, z) is the galaxy bias, and
the pre-factor of 4/3 arises from a spherical average over the infall
peculiar velocities (Kaiser 1987). The value of bias b for a halo mass M
may be approximated using the Press-Schechter formalism (Mo & White
1996), modified to include non-spherical collapse (Sheth, Mo & Tormen
2001)



$$ b(M,z) = 1 + \frac{1}{\delta_c}\left[\nu'^2 + b\,\nu'^{2(1-c)} - \frac{\nu'^{2c}/\sqrt{a}}{\nu'^{2c}+b(1-c)(1-c/2)}\right], \qquad (9) $$

where ν = δ_c²/σ²(M), ν' = √a ν, a = 0.707, b = 0.5 and c = 0.6. Here
σ(M) is the variance of the density field smoothed on a mass scale M at
redshift z. This expression yields an accurate approximation to the halo
bias determined from N-body simulations (Sheth, Mo & Tormen 2001).

In the left-hand panels of Figure 4 we plot 21cm brightness temperature
as a function of δ_gal. Results are shown for two values of galaxy mass,
M = 10^10 M_⊙ (solid line) and M = 10^11 M_⊙ (dashed line) assuming
C = 10. We also show results for M = 10^10 M_⊙ assuming C = 2
(dot-dashed line). The upper row of Figure 4 shows results for z = 6.57,
and the lower row for z = 8.

The properties of the galaxy population correlate with the level of
redshifted 21cm emission. These properties depend on the overdensity of
the IGM, whose typical fluctuation level is a function of scale. As a
result, the amplitude of the correlation between galaxy overdensity and
21cm emission will also depend on angular scale. In the right panels of
Figure 4 we plot the cross-correlation function

$$ \xi_{\rm gal}(\theta) = \left\langle \delta_{\rm gal}\times(T-\langle T\rangle)\right\rangle = \frac{1}{\sqrt{2\pi}\,\sigma(R)}\int d\delta\,\left(\delta_{\rm gal}\times(T-\langle T\rangle)\right) e^{-\delta^2/2\sigma(R)^2} \qquad (10) $$

for the IGM smoothed on various angular scales, θ = R/d_A. Results are
again shown for two values of galaxy mass, M = 10^10 M_⊙ (solid line)
and M = 10^11 M_⊙ (dashed line) assuming C = 10. As before we also show
results for M = 10^10 M_⊙ assuming C = 2 (dot-dashed line). The lines
show power-laws of slope d(log |ξ_gal|)/d(log θ) = −1, −2 and −3
respectively. Since in these cases we find the brightness temperature to
be lower in over-dense regions, we also find that massive galaxies
correlate negatively with 21cm brightness temperature. The variance is
lower on larger scales, and so the amplitude of the correlation is
reduced.

The results shown in Figures 1, 2 and 4 are sensitive to the value of
clumping factor. Weaker clumping leads to greater ionization in
overdense regions, and hence a larger variation of ionization fraction
with overdensity. This in turn leads to increases in the amplitudes of
the auto-correlation and cross-correlation functions.
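
Equation (10) can be sketched numerically with Gaussian weights on a grid of δ. In the example the brightness temperature is the linearised Equation (2) with illustrative values (⟨x_HI⟩ = 0.5, β = −1.5), and the bias b = 5 and σ(R) = 0.1 are hypothetical inputs, not values computed from Equation (9).

```python
import math


def cross_correlation_mk(temp_of_delta, bias, sigma, n_steps=4001, n_sigma=8.0):
    """Equation (10): <delta_gal (T - <T>)> with delta_gal = (4/3) b delta, in mK."""
    step = 2.0 * n_sigma * sigma / (n_steps - 1)
    deltas = [-n_sigma * sigma + i * step for i in range(n_steps)]
    weights = [math.exp(-0.5 * (d / sigma) ** 2) for d in deltas]  # unnormalised Gaussian
    norm = sum(weights)
    temps = [temp_of_delta(d) for d in deltas]
    t_mean = sum(w * t for w, t in zip(weights, temps)) / norm
    return sum(
        w * (4.0 / 3.0) * bias * d * (t - t_mean)
        for w, d, t in zip(weights, deltas, temps)
    ) / norm


def linear_temperature_mk(delta, x_mean=0.5, beta=-1.5):
    """Linearised Equation (2); x_mean and beta are illustrative values."""
    return 22.0 * x_mean * (1.0 + (4.0 / 3.0 + beta) * delta)


if __name__ == "__main__":
    print(cross_correlation_mk(linear_temperature_mk, bias=5.0, sigma=0.1))
```

With β < −4/3 the result is negative, which is the anti-correlation between massive galaxies and 21cm brightness temperature described in the text; for a linear T(δ) it scales as σ(R)², so the amplitude falls on larger scales.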



On scales of θ ~ 3' the typical fluctuation in units of 1 mK multiplied
by the typical overdensity of galaxies is of order unity. In the next
section we demonstrate that the brightness temperature around Lyα
emitters at z ~ 6 should be sufficiently different from the average
value to be detectable by the first generation of redshifted 21cm
surveys.



5 MEASUREMENT OF CORRELATIONS 
BETWEEN Lyα EMITTERS AND
FLUCTUATING 21CM EMISSION 

The correlation between massive galaxies (which probe dense regions) and
the 21cm signal (which traces the geometry of the neutral gas) will
offer a direct probe of the process of reionization. The results of §3
and §4 suggest that Lyα emitters reside in massive galaxies at high
redshift, and that overdensities in the number counts of these galaxies
trace the more highly ionized regions. Observationally, Lyα surveys may
provide the most straightforward comparison with 21cm maps for several
reasons which we outline below. As a concrete example we consider the
case of the Subaru Deep Field and its comparison with an MWA-LFD
(Mileura Wide Field Array-Low Frequency Demonstrator²) field. We can
estimate the sizes of fluctuations needed for detection by considering
the uncertainty in the 21cm brightness averaged over synthesised beams
which do or do not contain Lyα emitting galaxies.

Radio telescopes measure images in data cubes, and an instrument like
the MWA-LFD will have much higher (spatial) resolution along the
line-of-sight than perpendicular to the line-of-sight. This allows for
the option of binning in frequency space so that the data cube can have
equal resolution in all three spatial dimensions. For an expected
MWA-LFD resolution (beam radius) of θ_beam = 5 arc-minutes, this
corresponds to a region of ±0.85MHz or ±1.6 physical Mpc along the
line-of-sight at z = 6.5. A frequency interval of 2 × 0.85MHz = 1.7MHz
at z = 6.5 corresponds to a value of Δz/(1 + z) = 0.008, or a redshift
range of Δz = 0.06. For an efficient cross-correlation we would
therefore like galaxy redshifts to be known to ±0.03, which requires
spectroscopy. Surveys for Lyα emitters narrow down the redshift range
initially via the use of narrow-band filters, and then find sources with
a strong emission line, making spectroscopy attainable.

² See http://www.haystack.mit.edu/ast/arrays/mwa/index.html

Figure 4. Left: 21cm brightness temperature as a function of δ_gal. Two
values of galaxy mass are assumed for a clumping of C = 10,
M = 10^10 M_⊙ (solid line) and M = 10^11 M_⊙ (dashed line). The
dot-dashed line shows C = 2 with M = 10^10 M_⊙. Right: The
cross-correlation function ξ_gal = ⟨δ_gal × (T − ⟨T⟩)⟩ for the IGM
smoothed on various angular scales (θ). The function is presented
assuming C = 10 for masses of M = 10^10 M_⊙ (solid line) and
M = 10^11 M_⊙ (dashed line). The dot-dashed line represents C = 2 with
M = 10^10 M_⊙. The lines show power-laws of slope
d(log ξ_gal)/d(log θ) = −1, −2 and −3. The upper and lower rows
correspond to observations at z = 6.57 and z = 8 respectively.

The characteristic lengths corresponding to the expected resolution of
the MWA-LFD are well matched to the dimensions of surveys for Lyα
emitting galaxies. For example, the SDF has a total area of 876 square
arc-minutes. In this field the Subaru team found 50 Lyα emitter
candidates. Spectra of 22 of these were obtained, of which 16 were
verified as high redshift galaxies, implying N_Ly ~ 36 galaxies in this
deep field. The line-of-sight dimension of the survey volume is set by
the width of the narrow-band filter (Δλ = 132Å), centered on a mean
wavelength of λ_0 = 9196Å. This results in a field depth of
Δz = (1 + z)Δλ/λ_0 = 0.11, or l = (c dt/dz)Δz ~ 5.9Mpc. At z ~ 6 this
depth is comparable to, but larger than, the line-of-sight dimension
corresponding to the expected MWA-LFD angular resolution. We may
therefore divide the line of sight distance into a number of bins
N_bins, so that the size of each bin is l_bin = l/N_bins. We first
define the angular scale that results in an angular beam radius
θ_beam = R_beam/d_A, where R_beam = l_bin/2. This results in cylindrical
volumes of space where the diameter equals the length of the cylinder.
We are now able to find the number of such cylindrical regions in the
SDF survey volume (N_total), as well as the number containing galaxies
[N_gal = N_Ly/(1 + ξ_Ly), where ξ_Ly is the excess probability above
random of finding a second galaxy within a cylinder], and the number of
cylinders not containing galaxies (N_nogal).
We next discuss the response of a phased array to the brightness
temperature contrast of the IGM. Assuming that calibration can be
performed ideally, and that foreground subtraction is perfect, the
root-mean-square fluctuations in brightness temperature are given by the
radiometer equation

$$ \left\langle \Delta T^2\right\rangle^{1/2} = \frac{\epsilon\,\lambda^2\,T_{\rm sys}}{A_{\rm tot}\,\Omega_{\rm b}\sqrt{t_{\rm int}\,\Delta\nu}}, \qquad (11) $$

where λ is the wavelength, T_sys is the system temperature, A_tot the
collecting area, Ω_b the effective solid angle of the synthesized beam
in steradians, t_int is the integration time, Δν is the size of the
frequency bin, and ε is a constant that describes the overall efficiency
of the telescope. We optimistically adopt ε = 1 in this paper. In units
relevant for upcoming telescopes and at ν = 200MHz, we find (Wyithe,
Loeb & Barnes 2005)

$$ \Delta T = 7.5\left(\frac{1.97}{C_{\rm beam}}\right){\rm mK}\left(\frac{A_{\rm tot}}{A_{\rm LFD}}\right)^{-1}\left(\frac{\Delta\nu}{1\,{\rm MHz}}\right)^{-1/2}\left(\frac{t_{\rm int}}{100\,{\rm hr}}\right)^{-1/2}\left(\frac{\theta_{\rm beam}}{5'}\right)^{-2}. \qquad (12) $$

Here A_LFD is the collecting area of a phased array consisting of 500
tiles each with 16 cross-dipoles [the effective collecting area of an
LFD tile with 4×4 cross-dipole array with 1.07m spacing is ~ 17 to 19m²
between 100 and 200MHz (B. Correy, private communication)]. The system
temperature at 200MHz will be dominated by the sky and has a value
T_sys ~ 250K. Δν is the frequency range over which the signal is
smoothed and θ_beam is the size of the synthesized beam. The value of
θ_beam can be regarded as the radius of a hypothetical top-hat beam, or
as the variance of a hypothetical Gaussian beam. The corresponding
values of the constant C_beam are 1 and 1.97 respectively. Given the
noise per synthesised beam, we can find the noise averaged in regions
with and without galaxies as ΔT_gal = ΔT/√N_gal and
ΔT_nogal = ΔT/√N_nogal respectively. The resulting uncertainty in
brightness temperature between regions with and without galaxies is
ΔT_diff = √[(ΔT_gal)² + (ΔT_nogal)²].
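
The noise budget can be sketched as follows. The scalings with collecting area, bandwidth, integration time and beam size follow the radiometer equation (11); the fiducial normalisation (7.5 mK for a Gaussian beam with C_beam = 1.97, one LFD collecting area, 1 MHz, 100 hours and a 5 arc-minute beam) is an assumption of this sketch, as are the function names and the example cylinder counts.

```python
import math


def noise_per_beam_mk(c_beam=1.97, area_ratio=1.0, dnu_mhz=1.0,
                      t_int_hr=100.0, theta_beam_arcmin=5.0):
    """Thermal noise per synthesised beam in mK, scaled from an assumed 7.5 mK fiducial.

    area_ratio is A_tot / A_LFD; the other arguments are in the units named.
    """
    return (7.5 * (1.97 / c_beam) / area_ratio
            / math.sqrt(dnu_mhz)
            / math.sqrt(t_int_hr / 100.0)
            * (theta_beam_arcmin / 5.0) ** -2)


def noise_difference_mk(delta_t, n_gal, n_nogal):
    """Uncertainty on the mean temperature difference between the two sets of regions."""
    dt_gal = delta_t / math.sqrt(n_gal)      # noise averaged over regions with galaxies
    dt_nogal = delta_t / math.sqrt(n_nogal)  # noise averaged over regions without galaxies
    return math.sqrt(dt_gal ** 2 + dt_nogal ** 2)


if __name__ == "__main__":
    dt = noise_per_beam_mk(t_int_hr=1000.0)
    # Hypothetical counts of cylinders with and without galaxies.
    print(dt, noise_difference_mk(dt, n_gal=30, n_nogal=100))
```

Because ΔT_gal is averaged over far fewer regions than ΔT_nogal, the uncertainty on the difference is dominated by the number of beams that contain galaxies.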

The measurement of fluctuations in brightness temperature between
over-dense and under-dense regions will require removal of emission from
unresolved foreground objects. At a fixed frequency and on scales of a
few arc-minutes, these foregrounds are believed to vary in amplitude at
a level several orders of magnitude in excess of the expected 21cm
signal (Di Matteo et al. 2002). However, the foregrounds are expected to
have smooth power-law spectra, while 21cm emission from the IGM will
fluctuate in both space and frequency. This smoothness should allow
removal through subtraction of a continuum component, leaving
fluctuations due to 21cm emission. However, since the amplitude of the
continuum will be different along each line of sight, we will be unable
to determine its absolute level. Rather, the fluctuations in 21cm
emission will need to be measured relative to the average continuum
component along each line of sight (about which the fluctuations in
brightness temperature will average to zero). This average continuum
must be measured from a region of spectrum having finite length (the
band-pass), and will therefore be determined only to an accuracy
corresponding to fluctuations in the mean 21cm emission across the whole
band-pass. For the MWA-LFD the band-pass will be 32MHz. We may estimate
the level of uncertainty introduced through subtraction of the continuum
by considering the variance of the density field in cylinders of radius
a few arc-minutes (the synthesised beam size), and line of sight length
corresponding to the band-pass. This variance can be shown to be around
a few percent, which should be compared to the 10-20% representing the
variance of the density field smoothed in spheres of radius a few
arc-minutes. Therefore, while fluctuations in the continuum subtraction
will add to the uncertainty in the measurement, they are substantially
smaller than the 21cm fluctuations of interest.

Figure 5. The differential (left) and cumulative (right) probability
distributions for the large scale overdensity at the location of an
observed Lyα emitter. The solid, short-dashed, long-dashed and
dot-dashed lines correspond to splitting up the Lyα survey into
N_bins = 1, 2, 3 and 4 redshift bins. The upper and lower rows assume
Lyα galaxy masses of 10^10 M_⊙ and 10^11 M_⊙ respectively.

In summary, there are two aspects of the problem. First, there is the
question of the distribution of overdensities (smoothed on the scale of
the 21cm beam) in the IGM surrounding high redshift galaxies. The mean
of this distribution must be significantly in excess of zero for any
correlation of Lyα galaxies with large scale fluctuations in 21cm
emission to be detectable. Second, there is the question of the
sensitivity of the radio array, and the difference in average brightness
temperature between regions (on the scale of the beam) that contain or
do not contain galaxies. We treat each of these issues in turn.



5.1 Lyα emitters as tracers of overdense regions

Strong clustering of massive sources in overdense regions implies that
these sources should trace the higher density regions of IGM. In this
section we compute the distribution of overdensities on a scale R that
are centered on galaxies of mass M. These overdensities are larger than
average since galaxies preferentially form in overdense regions.

Figure 6. Left panel: The measured error in brightness temperature
between regions that contain or do not contain Lyα emitters as a
function of the fraction of galaxies included. The four lines correspond
to splitting up the Lyα survey into N_bins = 2, 3 and 4 redshift bins
(short dashed, long dashed and dot-dashed respectively). Right panel:
Comparison between the measurement error in δT at the mean overdensity
corresponding to observed Lyα emitters, with our model for the level of
variation of temperature with overdensity (C = 10). The errors have been
plotted around the mean expected from the model. The 3 sets of points
correspond to splitting up the Lyα survey into N_bins = 2, 3 and 4
redshift bins, and the values shown correspond to the minimum value of
ΔT_diff as a function of the fraction of the brightest galaxies used,
F_gals. The line styles are as per the left panel. The results were
computed assuming experimental parameters that correspond to one SDF,
combined with 1000 hours of integration on the MWA-LFD.

The likelihood of observing a galaxy at a random location is
proportional to the number density of galaxies. At small values of large
scale overdensity δ, this density is proportional to [1 + δ b(M, z)].
More generally, given a large scale overdensity δ on a scale R, the
likelihood of observing a galaxy may be estimated from the Sheth-Tormen
(2002) mass function as

$$ \mathcal{L}_g(\delta) = \frac{(1+\delta)\,\nu\left(1+\nu^{-2p}\right)e^{-a\nu^2/2}}{\bar{\nu}\left(1+\bar{\nu}^{-2p}\right)e^{-a\bar{\nu}^2/2}}, \qquad (13) $$

where ν = (δ_c − δ)/[σ(M)] and ν̄ = δ_c/[σ(M)]. Here σ(M) is the
variance of the density field smoothed with a top-hat window on a mass
scale M at redshift z, and a = 0.707 and p = 0.3 are constants. Note
that here, as elsewhere in this paper, we work with overdensities and
variances computed at the redshift of interest (i.e. not extrapolated to
z = 0). Equation (13) is simply the ratio of the number density of halos
in a region of over-density δ to the number density of halos in the
background universe. This ratio has been used to derive the bias for
small values of δ (Mo & White 1996; Sheth, Mo & Tormen 2001). For
example, in the Press-Schechter (1974) formalism we write


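$$\mathcal{L}_{\rm g}(\delta) = (1+\delta)\left[\frac{dn}{dM}(\bar{\nu}) + \frac{d^{2}n}{dM\,d\nu}(\bar{\nu})\,\frac{d\nu}{d\delta}\,\delta\right]\left[\frac{dn}{dM}(\bar{\nu})\right]^{-1} \sim 1 + \delta\left(1 + \frac{\bar{\nu}^{2}-1}{\sigma(M)\,\bar{\nu}}\right) \equiv 1 + \delta b, \qquad (14)$$

where $(dn/dM)(\bar{\nu})$ and $(dn/dM)(\nu)$ are the average and
perturbed mass functions, and $b$ is the bias factor. Utilising
Bayes' theorem, we find the a-posteriori probability distribution
for the overdensity $\delta$ on the scale $R$ given the locations
defined by a galaxy population. We obtain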

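$$\left.\frac{dP}{d\delta}\right|_{\rm gal} \propto \mathcal{L}_{\rm g}(\delta)\,\frac{dP_{\rm prior}}{d\delta}, \qquad (15)$$

where $dP_{\rm prior}/d\delta$ is a Gaussian of variance $\sigma(R)$.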
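As an illustration of Equations (13) and (15), the following Python sketch evaluates the Sheth-Tormen likelihood ratio and the resulting posterior on a grid of overdensities. It is a minimal sketch rather than the calculation used for the figures: the critical overdensity $\delta_c = 1.686$, the treatment of $\sigma(R)$ as the standard deviation of the Gaussian prior, and the values $\sigma(M) = 1.0$ and $\sigma(R) = 0.1$ are assumptions chosen only for illustration, and all function names are hypothetical.

```python
import numpy as np

DELTA_C = 1.686          # assumed critical linear overdensity for collapse
A_ST, P_ST = 0.707, 0.3  # Sheth-Tormen constants quoted in the text


def st_shape(nu):
    """nu-dependent part of the mass function as it enters Equation (13)."""
    return nu * (1.0 + nu ** (-2.0 * P_ST)) * np.exp(-A_ST * nu**2 / 2.0)


def galaxy_likelihood(delta, sigma_m):
    """Equation (13): halo abundance at overdensity delta relative to the mean."""
    nu = (DELTA_C - delta) / sigma_m
    nu_bar = DELTA_C / sigma_m
    return (1.0 + delta) * st_shape(nu) / st_shape(nu_bar)


def posterior(delta, sigma_m, sigma_r):
    """Equation (15): normalised dP/d(delta) at galaxy positions on a uniform grid."""
    step = delta[1] - delta[0]
    prior = np.exp(-delta**2 / (2.0 * sigma_r**2))  # Gaussian prior (assumed rms sigma_r)
    post = galaxy_likelihood(delta, sigma_m) * prior
    return post / (post.sum() * step)


# Illustrative (assumed) values: sigma(M) = 1.0 and sigma(R) = 0.1.
delta = np.linspace(-0.5, 0.5, 2001)
step = delta[1] - delta[0]
post = posterior(delta, sigma_m=1.0, sigma_r=0.1)
mean_delta = np.sum(delta * post) * step  # mean overdensity at a galaxy position
```

Because the likelihood ratio rises with $\delta$ near $\delta = 0$, the posterior mean in this sketch is shifted to positive overdensity, which is the qualitative behaviour described in the text.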



In Figure |S] we show the differential (left panel) and
cumulative (right panel) distributions of overdensity $\delta(R_{\rm beam})$
at the position of a Lyα emitting galaxy at $z = 6.57$. The
overdensities refer to a density distribution that has been
smoothed on a scale $R_{\rm beam}$, and the Lyα emitters were
assumed to have masses of $M = 10^{10}\,M_\odot$ (upper row) and
$M = 10^{11}\,M_\odot$ (lower row). In each case the 4 lines
correspond to values of $N_{\rm bins} = 1$, 2, 3 and 4. The angular scales
corresponding to these cases are $\theta_{\rm beam} = 8.3'$, $4.2'$, $2.8'$, and
$2.1'$. The means of these distributions are each greater than
zero, with the departure greater in the case of smaller beam
size (which has larger intrinsic variance).

Not every galaxy will be observed in an overdense region.
However, we are interested in correlating the 21cm signal
with galaxy position. The quantity of interest is therefore
the distribution of the mean overdensity as measured using
samples of $N_{\rm gal}$ galaxies. For $N_{\rm gal} = 30$ and $M = 10^{10}\,M_\odot$,
we find that the mean overdensities sampled by galaxies are
$\langle\delta\rangle_{\rm gal} = 0.037 \pm 0.015$, $0.10 \pm 0.02$, $0.17 \pm 0.03$, $0.22 \pm 0.04$ for
$N_{\rm bins} = 1$, 2, 3 and 4 respectively. These values are each
significantly in excess of zero. Mass conservation implies that
regions devoid of galaxies must be underdense.



5.2 The sensitivity to brightness temperature 
fluctuations. 

Results for $\Delta T_{\rm diff}$ are plotted in the left panel of Figure |S|
assuming experimental parameters that correspond to one
SDF, combined with 1000 hours of integration using the
MWA-LFD. Here three lines are plotted corresponding to
$N_{\rm bin} = 2$, 3 and 4 (bottom to top). The case of $N_{\rm bin} = 1$
leads to a near absence of empty regions, and hence a large
uncertainty. The Lyα emitters in the SDF cover an order
of magnitude in luminosity. The noise is therefore plotted
as a function of the fraction of the brightest galaxies used
($F_{\rm gals}$). The noise level decreases as the inverse square root of
the integration time, in inverse proportion to the collecting area or
number of LFD units, and as the inverse square root of the number
of SDF equivalents surveyed.

First let us suppose that the ionization fraction was uniform
across the whole IGM. In this case the brightness temperature
of regions surrounding galaxies would be greater
than average by a factor $(1 + \frac{4}{3}\langle\delta\rangle_{\rm gal})$. Similarly, those
regions without Lyα emitters have $\langle\delta\rangle_{\rm nogal} < 0$, and hence
brightness temperatures that are lower than average by a
factor $(1 + \frac{4}{3}\langle\delta\rangle_{\rm nogal})$. We define the difference in brightness
temperature between regions that contain and do not
contain galaxies to be $\delta T$. The intrinsic (i.e. uniform ionization
fraction) difference in brightness temperature between
regions containing and not containing galaxies would therefore
be

$$\delta T_{\rm int} = 22\,{\rm mK}\; x_{\rm HI}\,\frac{4}{3}\left(\langle\delta\rangle_{\rm gal} - \langle\delta\rangle_{\rm nogal}\right) \sim 4\,{\rm mK}\left(\frac{x_{\rm HI}}{0.3}\right)\left(\frac{\langle\delta\rangle_{\rm gal} - \langle\delta\rangle_{\rm nogal}}{0.5}\right), \qquad (16)$$

where $x_{\rm HI}$ is the average neutral fraction. This offset must
be subtracted from any measured difference to find the
difference in ionization state between over and under-dense
regions. Equation (16) represents a value of $\delta T$ corresponding
to an exponent of $\beta = 0$ in the relation between overdensity
and neutral fraction (see Equation 0. 
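The arithmetic behind Equation (16) can be checked with a few lines of Python. This is a sketch only, assuming a 22 mK normalisation for the brightness temperature and a factor of 4/3 multiplying the overdensity; the function name is hypothetical and the example overdensities are illustrative values chosen so that they differ by 0.5.

```python
def intrinsic_offset_mk(x_hi, delta_gal, delta_nogal, t0_mk=22.0):
    """Brightness temperature offset (mK) for a uniform ionization fraction.

    Assumes delta_T_int = t0 * x_HI * (4/3) * (<delta>_gal - <delta>_nogal).
    """
    return t0_mk * x_hi * (4.0 / 3.0) * (delta_gal - delta_nogal)


# Fiducial case: x_HI = 0.3 and an overdensity difference of 0.5 (assumed values).
offset = intrinsic_offset_mk(x_hi=0.3, delta_gal=0.25, delta_nogal=-0.25)
```

For these inputs the product is $22 \times 0.3 \times \frac{4}{3} \times 0.5 = 4.4$ mK, consistent with the quoted normalisation of about 4 mK.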

From Equation ||5J we find the uncertainty ($\Delta\beta$) in the
exponent $\beta$ given an uncertainty in the brightness temperature
($\Delta T_{\rm diff}$) of the overdense regions

$$\Delta\beta \sim 0.25\left(\frac{\Delta T_{\rm diff}}{1\,{\rm mK}}\right)\left(\frac{x_{\rm HI}}{0.5}\right)^{-1}\left(\frac{\langle\delta\rangle_{\rm gal} - \langle\delta\rangle_{\rm nogal}}{0.4}\right)^{-1}. \qquad (17)$$

Figure [(^ suggests that $\Delta T_{\rm diff} \sim 1$ mK will be achievable with
first generation instruments, combined with Lyα surveys of
comparable size to those already performed. Given a measured
uncertainty of $\Delta T_{\rm diff} \sim 1$ mK, we can determine the
exponent $\beta$ to $\Delta\beta = \pm 0.25$. This implies that we could
easily tell the difference between reionization scenarios that 
predict xhi tx (1 + 8)~ s (i.e. the relation similar to that im- 
plied by our model), and a scenario where ionization was 
uniform with $x_{\rm HI} \propto$ const.
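The scaling in Equation (17) can be written as a small helper to see how the constraint on $\beta$ responds to the noise level, the neutral fraction and the overdensity contrast. This is a hedged sketch: it assumes that the uncertainty scales linearly with $\Delta T_{\rm diff}$ and inversely with both $x_{\rm HI}$ and the overdensity difference, with the fiducial values of 1 mK, 0.5 and 0.4; the function name and example inputs are hypothetical.

```python
def delta_beta(dt_diff_mk, x_hi, delta_gal, delta_nogal):
    """Uncertainty in the exponent beta, assuming the scaling of Equation (17)."""
    contrast = delta_gal - delta_nogal
    return 0.25 * (dt_diff_mk / 1.0) / (x_hi / 0.5) / (contrast / 0.4)


# Fiducial case quoted in the text: 1 mK noise, x_HI = 0.5, contrast of 0.4.
fiducial = delta_beta(1.0, 0.5, 0.2, -0.2)

# Halving the noise halves the uncertainty under the assumed linear scaling.
halved_noise = delta_beta(0.5, 0.5, 0.2, -0.2)
```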

Finally, as an example, we compare the expected brightness
temperature noise and overdensity estimates (for
experimental parameters that correspond to one SDF, combined
with 1000 hours of integration using the MWA-LFD)
to our model calculation of brightness temperature as a function
of large scale overdensity (with $C = 10$). Three curves
are shown in the right panel of Figure |S| with mock data
points showing the estimated error over-plotted for comparison
(here the error bars have been computed for the value of
$F_{\rm gals}$ that provides the smallest value of $\Delta T_{\rm diff}$). The three
sets of points and curves correspond to splitting up of the
Lyα survey into $N_{\rm bin} = 2$, 3 and 4 redshift bins. The model
curves have been computed assuming a scale for the
overdensities that corresponds to the beam size ($R_{\rm beam}$) in each
case. Note that where $N_{\rm bins}$ is smaller ($\sim 2$), the beams
containing galaxies fill roughly half of the survey volume, so
that $\langle\delta\rangle_{\rm gal} \sim -\langle\delta\rangle_{\rm nogal}$. Conversely, where $N_{\rm bins}$ is larger,
the beams containing galaxies fill a small fraction of the
survey volume. In this case $\langle\delta\rangle_{\rm gal} > -\langle\delta\rangle_{\rm nogal}$. This example
shows that even at this late stage of reionization, first
generation surveys will be able to detect the correlation of
galaxies with 21cm emission. In the future, larger surveys
and instruments will allow substantial improvements. For
example, an array with 10 times the MWA-LFD collecting
area, combined with 4 SDFs, would reduce the uncertainty
in $\beta$ by a factor of $\sim 20$.
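The factor of about 20 follows from the noise scalings given earlier in this section, as the short Python sketch below illustrates. It assumes only that the noise falls as the inverse square root of the integration time, in inverse proportion to the collecting area, and as the inverse square root of the number of SDF equivalents; the function and parameter names are hypothetical.

```python
import math


def noise_scale(time_factor=1.0, area_factor=1.0, field_factor=1.0):
    """Noise relative to the fiducial survey (one SDF, 1000 hours, one MWA-LFD).

    Each argument is the multiplicative increase in integration time,
    collecting area, or number of SDF-equivalent fields.
    """
    return 1.0 / (math.sqrt(time_factor) * area_factor * math.sqrt(field_factor))


# Ten times the collecting area and four SDF-equivalent fields.
improvement = 1.0 / noise_scale(area_factor=10.0, field_factor=4.0)
```

Here the improvement is $10 \times \sqrt{4} = 20$, matching the reduction in the uncertainty quoted above.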



6 SUMMARY 

In this paper we have calculated the expected cross- 
correlation between the distribution of galaxies and the in- 
tergalactic 21cm emission at high redshifts. We constructed 
a simple model for reionization that accounts for both galaxy 
bias and an enhanced recombination rate in overdense re- 
gions, and used this model to compute the ionization frac- 
tion as a function of large scale overdensity in the IGM. Our 
model predicts that overdense regions will be ionized early 
as a result of their biased galaxy formation. This early phase 
of reionization in overdense regions leads to anti-correlations 
between the 21cm emission and the overdensity of baryons, 
and between the 21cm emission and the overdensity of neu- 
tral hydrogen. In addition, because galaxies are biased to- 
wards overdense regions, our model also predicts an anti- 
correlation between 21cm emission and the galaxy popula- 
tion. 

To explore the detectability of any correlation between 
21cm emission and galaxy properties, we also constructed a 
simple model for the Lyα emission corresponding to a
dark-matter halo of a given mass, and hence for the Lyα
luminosity function. We compared this model to an existing Lyα
survey in the Subaru Deep Field. Through this comparison
we demonstrated that current surveys probe galaxy masses
that are larger than $10^{10}\,M_\odot$. Due to their biased formation,
galaxies of this mass are highly clustered in overdense re- 
gions of the IGM. We have shown that by comparing 21cm 
emission from regions near observed galaxies to those away 
from observed galaxies, future redshifted 21cm observations 
will be able to test the generic prediction that overdense 
regions are reionized first. 